Every day we encounter dozens of people, and in order to interact with them appropriately
we need to recognize their identity. The face is a crucial source of information to recognize 
a person's identity. However, recognizing the identity of a face is challenging because it 
requires distinguishing between very similar images (e.g., the front views of two different 
faces) while categorizing very different images (e.g., a front view and a profile) as the 
same person. Neuroimaging has the whole-brain coverage needed to investigate where 
representations of face identity are encoded, but it is limited in terms of spatial and 
temporal resolution. In this article, we review recent neuroimaging research that attempted 
to investigate the representation of face identity, the challenges it faces, and the proposed 
solutions, to conclude that given the current state of the evidence the right anterior 
temporal lobe is the most promising candidate region for the representation of face identity. 

INTRODUCTION

In this paper, we focus on recent neuroimaging research that 
has investigated aspects of the neural mechanisms underlying the 
perceptual recognition of face identity. The ability to recognize 
individuals is crucial for guiding behavior - it allows us to retrieve 
information about people and interact with them in appropriate 
ways. Many different cues can be used to recognize an individ- 
ual, including the appearance of the face, the sound of the voice, 
as well as the context in which we encounter a person and prior 
knowledge about his/her current general location (see Oliva and 
Torralba, 2007; Goesaert and Op de Beeck, 2013). A promising 
approach consists in studying how each of these cues is processed 
when other cues are controlled, to then proceed with an investiga- 
tion of how the different cues are integrated. Among the different 
cues that can be used for person recognition, the face is a cru- 
cial source of information and is usually sufficient in isolation 
to recognize a person's identity. However, recognizing face identity
is also computationally challenging: it requires discounting
identity-irrelevant changes in sensory stimulation (such as changes
in viewpoint and illumination) without losing the ability to perform
the fine-grained discriminations needed to distinguish the faces
of similar individuals.

The earliest insights into the neural mechanisms underlying the 
ability to recognize face identity came from the study of patients 
with selective impairment for the recognition of faces (Charcot, 
1883; Wilbrand, 1892; Heidenhain, 1927; Jossmann, 1929), which 
was subsequently named prosopagnosia (Bodamer, 1947). Hecaen 
and Angelergues (1962) investigated the location of lesions pro- 
ducing selective deficits for faces in a group of 22 patients, and 
observed that prosopagnosic patients tended to have lesions in 
the right hemisphere, often involving occipital regions. A review
of the neuropsychological literature identified the right
occipitotemporal cortex as the most common location of the lesion in
prosopagnosic patients (Meadows, 1974). Convergent evidence in
support of the view that damage to the occipitotemporal cortex
leads to prosopagnosia was reported in several studies (Whiteley
and Warrington, 1977; Damasio et al., 1982; Malone et al., 1982).

Other neuropsychological studies reported deficits for the
recognition of familiar and famous faces in patients with herpes
simplex encephalitis (Warrington and Shallice, 1984; Warrington
and McCarthy, 1988) and semantic dementia (Snowden et al.,
2004), with more frequent face recognition deficits in the right
than in the left temporal variant of semantic dementia (Thompson
et al., 2003). These pathologies affect the anterior portions of the
temporal lobe (Kapur et al., 1994; Mummery et al., 2000; Gitelman
et al., 2001; Hodges and Patterson, 2007; Noppeney et al.,
2007). Furthermore, the highest lesion overlap in patients with
face recognition deficits was found to be in the right anterior
temporal lobe (Tranel et al., 1997). Consistent with the
neuropsychological literature, neuroimaging studies in healthy participants
identified regions showing stronger activity for faces than for
other kinds of objects in occipitotemporal cortex [occipital face
area (OFA) and fusiform face area (FFA); Sergent et al., 1992; Puce
et al., 1996; Kanwisher et al., 1997; Gauthier et al., 2000; see Cukur
et al., 2013 for an in-depth analysis of voxel response profiles] and
the anterior temporal lobes (Rajimehr et al., 2009).

Both occipitotemporal regions and anterior temporal regions 
show stronger activity for faces than other objects, and lesions 
in these regions lead to face processing deficits. What are the 
respective contributions of the two brain regions in representing
face identity? The finding that a lesion to a brain region leads
to a deficit for face recognition does not imply that that region
encodes representations of face identity - it might just provide 
necessary input to another region that represents face identity. At 
the same time, neither occipitotemporal nor anterior temporal 
regions seem to be involved merely in the processing of "low level" 
perceptual details. Patients with anterior temporal lesions have
intact basic perceptual abilities (Warrington and Shallice, 1984), 
and while patients with occipitotemporal lesions often have visual 
field defects (Meadows, 1974), they are able to describe and draw 
individual face parts (Bodamer, 1947). A deeper understanding 
of the properties of representations in these regions is needed to 
clarify their respective roles for the recognition of face identity. 
This paper is concerned with the neuroimaging research pursuing 
this understanding. In particular, the focus is on perceptual
representations of face identity, rather than on other aspects of person
identity such as associated semantic knowledge (Tsukiura et al.,
2002), or the sense of familiarity and emotional responses which
can be impaired in disorders such as Capgras syndrome (Ellis and 
Lewis, 2001). 

DISCRIMINATION OF FACE TOKENS 

Before delving into the discussion of the literature, it is nec- 
essary to introduce some terms and clarify their use. We will 
use the term "face token" to refer to a specific image of a 
face, seen from a particular viewpoint and under a particular
illumination. The recognition of face identity requires (1)
distinguishing between face tokens that depict different people,
and (2) recognizing when two different face tokens depict the
same person. We will use the term "invariant face representations"
to refer to representations that encode information about
whether two face tokens depict the same person, for some or
all pairs of face tokens that depict the same person. Note that
invariance can be partial: for example, there might be representations
that are invariant to changes in viewpoint of up to 35°.
Therefore, not all invariant face representations are representations
of face identity. We will reserve the term "representation of
face identity" for representations that encode information that
allows determining that two face tokens depict the same person
for all pairs of face tokens that are recognized as the same
person by a human observer. Whether or not there exists one
brain region that encodes representations with invariance across 
all transformations that humans can generalize across is an empir- 
ical question. To search for representations of face identity, we 
can first search for representations that distinguish between face 
tokens that depict different people, and then test whether and to
what extent they are invariant. Finding brain regions that distinguish
between face tokens that depict different people provides us
with a series of potential candidates for the representation of face 
identity. 

The investigation of regions that distinguish between face 
tokens that depict different people with functional magnetic 
resonance imaging (fMRI) is challenging, because when proper- 
ties like viewpoint and illumination are controlled, face tokens 
that depict different people do not produce significantly dif- 
ferent blood-oxygen-level dependent (BOLD) responses when 
analyzed with standard univariate approaches. Nonetheless, fMRI 
remains one of the best methods available to localize regions 
that distinguish between face tokens that depict different peo- 
ple. This is because it allows coverage of a large extent of the 
human brain in a single study, and because among the meth- 
ods with this property it is the one that offers the highest spatial 
resolution. 



For this reason, in the course of the past two decades, 
researchers used fMRI to investigate the neural mechanisms 
underlying the recognition of face identity, developing and 
employing experimental designs and data analysis approaches to 
meet the challenge posed by the subtle differences in the BOLD 
responses produced by different face tokens. 

One approach to identifying representations that distinguish
between face tokens that depict different people involves using
fMRI-adaptation (fMR-A). FMR-A is a phenomenon characterized
by reduced BOLD responses to repeated stimuli (Grill-Spector
et al., 1999). FMR-A has also been observed during the presentation
of two stimuli that are not identical but are similar along some
dimension (Grill-Spector et al., 1999; Vuilleumier et al., 2002). For
example, fMR-A can occur for the presentation of different stimuli
from the same category (Fairhall et al., 2011). FMR-A has been
used to investigate representations of face tokens in a series of
studies (Grill-Spector et al., 1999; Gauthier et al., 2000; Rotshtein
et al., 2004; Furl et al., 2007). Greater adaptation for repetitions of
the same face token than for the presentation of different face tokens
has been observed in the FFA (Gauthier et al., 2000), as well as in
occipitotemporal regions defined with a broader contrast between
faces and textures (Grill-Spector et al., 1999).

As an alternative to fMR-A, some researchers have used
multivariate pattern analysis (MVPA) to improve the sensitivity
of fMRI (Haxby et al., 2001; Haynes and Rees, 2006).
Multivariate approaches extract information from the pattern of
activity in multiple voxels. They are more sensitive than
univariate approaches, because they can distinguish between BOLD
responses within a region that have the same mean but different
spatial distributions.
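
The sensitivity argument can be illustrated with a minimal sketch in Python. The response values, the four-voxel region, and the nearest-centroid classifier are all assumptions chosen for illustration; they are not the data or the classifier of the studies cited. The sketch only shows that two conditions can have the same mean response in a region and still be separable from their spatial patterns.

```python
from statistics import mean

# Hypothetical BOLD patterns (arbitrary units) over four voxels in one region.
# Each inner list is one trial; the numbers are invented for illustration.
face_a = [[1.0, 0.0, 1.0, 0.0], [0.9, 0.1, 1.1, -0.1]]
face_b = [[0.0, 1.0, 0.0, 1.0], [0.1, 0.9, -0.1, 1.1]]


def region_mean(trials):
    """Univariate summary: average response over voxels and trials."""
    return mean(mean(trial) for trial in trials)


def centroid(trials):
    """Mean pattern across trials, one value per voxel."""
    return [mean(voxel) for voxel in zip(*trials)]


def sq_dist(p, q):
    """Squared Euclidean distance between two patterns."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def classify(pattern, centroids):
    """Nearest-centroid classifier: label of the closest mean pattern."""
    return min(centroids, key=lambda label: sq_dist(pattern, centroids[label]))


# Train on the first trial of each face, test on the second.
centroids = {"A": centroid(face_a[:1]), "B": centroid(face_b[:1])}
predictions = [classify(face_a[1], centroids), classify(face_b[1], centroids)]

# The two faces evoke the same mean response, so this difference is ~0.
univariate_difference = abs(region_mean(face_a) - region_mean(face_b))
```

In this toy case `univariate_difference` is essentially zero, whereas `predictions` recovers the correct label for both held-out trials, because the classifier uses which voxels respond rather than how much the region responds on average.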

A common method consists in using univariate analyses to
identify regions showing stronger responses to faces
than other objects ("face-selective" regions) and subsequently
investigating information content with MVPA within these regions.
With this region-of-interest (ROI) approach it has been shown
that face-selective regions, including notably the FFA, encode
information about face tokens (Nestor et al., 2011; Anzellotti et al.,
2013; Goesaert and Op de Beeck, 2013; Verosky et al., 2013; but
see Natu et al., 2010). However, this approach is based on the
implicit assumption that localizing the brain regions showing the 
greatest mean difference between the activity in response to faces 
and the activity in response to other objects exhaustively cap- 
tures the regions involved in the recognition of face identity. This 
assumption might not hold: there may be regions that do not 
show face-selectivity but still contribute to the recognition of face 
identity. 

An alternative to the use of face selectivity is searchlight
analysis (Kriegeskorte et al., 2006; Kriegeskorte and Bandettini, 2007),
which can identify regions that distinguish between face tokens in the
whole brain. In an early study (Kriegeskorte et al., 2007), searchlight
was used to detect information that distinguishes between
face tokens in the right anterior temporal lobe. The faces that were
distinguished, though, were of different genders. A more recent
study (Nestor et al., 2011) used searchlight and identified
information that distinguishes between face tokens of the same gender
in the right anterior temporal lobe and posterior temporal cortex
bilaterally.
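
The logic of a searchlight analysis can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure of the cited studies: the voxel grid, the response values, the sphere radius, and the nearest-centroid classifier are all hypothetical, and a real analysis would use cross-validation over runs and a statistical test on the resulting map.

```python
from statistics import mean


def sphere(center, radius, voxels):
    """Voxels whose Euclidean distance from `center` is at most `radius`."""
    return [v for v in voxels
            if sum((a - b) ** 2 for a, b in zip(v, center)) <= radius ** 2]


def accuracy(train, test, voxels):
    """Nearest-centroid accuracy using only the given voxels.

    `train` and `test` map a label to a list of trials; each trial maps a
    voxel coordinate to a response value.
    """
    centroids = {label: [mean(t[v] for t in trials) for v in voxels]
                 for label, trials in train.items()}
    correct = total = 0
    for label, trials in test.items():
        for t in trials:
            pattern = [t[v] for v in voxels]
            guess = min(centroids, key=lambda c: sum(
                (a - b) ** 2 for a, b in zip(pattern, centroids[c])))
            correct += guess == label
            total += 1
    return correct / total


def searchlight(train, test, voxels, radius=1.0):
    """Map each voxel to the accuracy of the sphere centred on it."""
    return {c: accuracy(train, test, sphere(c, radius, voxels)) for c in voxels}


# Toy volume: five voxels in a row; only voxel (0, 0, 0) carries information.
voxels = [(x, 0, 0) for x in range(5)]


def trial(signal):
    """Hypothetical trial with `signal` in the informative voxel, 0 elsewhere."""
    return {v: (signal if v == (0, 0, 0) else 0.0) for v in voxels}


train = {"A": [trial(1.0)], "B": [trial(-1.0)]}
test = {"A": [trial(0.8)], "B": [trial(-1.2)]}
accuracy_map = searchlight(train, test, voxels)
```

In this toy volume the spheres centred on voxels (0, 0, 0) and (1, 0, 0) both reach perfect accuracy, although only the first voxel carries any signal; spheres further away stay at chance. The second voxel is highlighted purely because of its informative neighbour.
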

Another method that can be used to identify information
that distinguishes between face tokens is recursive feature
elimination (RFE), a type of MVPA (De Martino et al., 2008; Formisano
et al., 2008). RFE has advantages (and some disadvantages) with
respect to both ROI-based and searchlight methods. RFE can
identify information that is distributed beyond the extent of a
searchlight sphere. It does not require that a set of contiguous
voxels classify the different conditions significantly above chance;
that is, informative voxels can be anywhere in the brain. This also
means that feature selection approaches do not require making
arbitrary choices about the size and shape of the regions within
which to search for information. In addition, RFE requires that
the identified voxels themselves contribute to the discrimination,
while in the case of searchlight an identified voxel does
not necessarily contribute to the discrimination: as long as other
voxels within the sphere provide significant classification
accuracy, the voxel will appear in the searchlight map, even if the
voxel itself is not informative (this is especially true for
SVM-based searchlight, see Etzel et al., 2013). The main disadvantage
of RFE is that in its current form it allows localization of voxels
that contribute to a given classification, but unlike searchlight
and representational similarity analysis (RSA) it does not allow
localization of regions based on a match between a neural
dissimilarity matrix and a dissimilarity matrix hypothesized by the
experimenter. However, for the purpose of localization of regions
involved in the representation of face tokens this is not a major
concern. To date, RFE has produced promising results for the
localization of regions that distinguish between face tokens that
depict different people (Figure 1), allowing localization of
informative voxels for the discrimination between gender-matched
faces in occipitotemporal and anterior temporal regions (Nestor
et al., 2011; Anzellotti et al., 2013), and in the posterior cingulate
and the posterior intraparietal sulcus (Anzellotti and Caramazza,
2014).
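
The elimination loop at the core of RFE can be sketched in a few lines. The sketch below rests on several assumptions that are not taken from the cited studies: a simple perceptron stands in for whatever linear classifier a real analysis would use (for instance a support vector machine), one voxel is removed per round, the four-voxel patterns are invented, and the cross-validation needed to avoid circularity is omitted.

```python
def fit_perceptron(patterns, labels, features, epochs=20):
    """Train a perceptron restricted to `features`; labels are +1 or -1."""
    weights = {f: 0.0 for f in features}
    bias = 0.0
    for _ in range(epochs):
        for pattern, label in zip(patterns, labels):
            score = bias + sum(weights[f] * pattern[f] for f in features)
            if label * score <= 0:  # misclassified: move towards this pattern
                for f in features:
                    weights[f] += label * pattern[f]
                bias += label
    return weights


def rfe(patterns, labels, n_keep):
    """Refit, drop the voxel with the smallest absolute weight, repeat."""
    features = list(range(len(patterns[0])))
    while len(features) > n_keep:
        weights = fit_perceptron(patterns, labels, features)
        weakest = min(features, key=lambda f: abs(weights[f]))
        features.remove(weakest)
    return features


# Hypothetical patterns over four voxels; voxels 0 and 3 separate the faces,
# voxels 1 and 2 carry only small fluctuations.
patterns = [
    [1.0, 0.1, -0.05, 1.0],    # face A
    [-1.0, 0.1, 0.1, -1.0],    # face B
    [1.2, -0.1, 0.1, 0.8],     # face A
    [-0.8, -0.1, -0.1, -1.2],  # face B
]
labels = [1, -1, 1, -1]
selected = rfe(patterns, labels, n_keep=2)
```

With these toy data the two surviving voxels are 0 and 3, which need not be adjacent: unlike a searchlight sphere, the selection places no constraint on where informative voxels are located.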

In sum, regions that distinguish between face tokens that depict
different people have been found in occipitotemporal cortex
bilaterally, in the anterior temporal lobes, in posterior cingulate and in
bilateral intraparietal sulcus (IPS). Very recent studies (Cowen et al., 2014; Nestor et al.,
2014) adopted principal component analysis (PCA) and independent
component analysis (ICA) to investigate classification for
larger numbers of face tokens, going beyond the small number
of identities used in most studies to date.

INVARIANT FACE REPRESENTATIONS 

Regions that distinguish between face tokens that depict differ- 
ent people are candidate regions for representing face identity, 
but not all of them necessarily encode representations of face 
identity. To individuate regions that represent face identity, it 
is important to investigate whether they encode invariant face 
representations. Studies investigating the invariance of face rep- 
resentations typically look for evidence of commonalities among 
representations of different face tokens that depict the same per- 
son. For this reason, it is particularly important to carefully 
control the stimuli used because the presence of commonali- 
ties in the low-level properties of different face tokens depicting 
a same person can lead to illusory invariance effects. Equating 
the average luminance, color and texture in the whole image is 
often insufficient as a control because visually responsive neu- 
rons at several stages of processing have local receptive fields 
that do not encompass the entire image. These challenges can 
be overcome by generating stimuli with computer graphics. Using 
computer graphics permits the careful control of the low-level
differences between face tokens at a local level (Anzellotti et al.,
2013; Anzellotti and Caramazza, 2014). Since even cartoon
faces elicit strong responses in face-selective neurons (Freiwald
et al., 2009), it is unlikely that the use of realistic 3D renderings
of faces would bias the results with respect to the use of
photographs.

FIGURE 1 | Brain regions encoding information that contributes to the
classification between different face tokens corresponding to different
individuals. vOcc, ventral occipital cortex; PTL, posterior temporal lobe; ATL,
anterior temporal lobe; pCing, posterior cingulate; IPS, intraparietal sulcus.
The current evidence indicates the right ATL, marked in green, as the most
likely candidate region for encoding invariant representations of face identity.

fMRI-adaptation can be used not only to individuate regions 
sensitive to differences in identity, but also to search for common- 
alities among representations of different face tokens that depict 
a same person. If a region encodes invariant face representations, 
the representations of different face tokens depicting the same per- 
son should overlap more than the representations of face tokens 
depicting different people, and therefore more fMR-A should be
observed for the presentation of different face tokens that depict
the same person than for face tokens that depict different people. One problem
with the underlying assumptions motivating the use of fMR-A 
to study invariant face representations is that even if we accept 
that regions encoding invariant face representations should show 
fMR-A for the presentation of different face tokens depicting a 
same person, it does not follow that all regions that show fMR- 
A for the presentation of different face tokens depicting a same 
person encode invariant face representations. One way in which 
a region could show fMR-A for different face tokens depicting a 
same person despite encoding non-invariant face representations 
is through top-down influences. Via top-down influences, recog- 
nition of two different images as tokens depicting a same identity 
could lead to reduced activity not only in regions encoding invari- 
ant representations but also in early visual regions. Whether or 
not reduction in neural activity due to repetition can occur as a 
consequence of top-down influences is controversial (Xiang and 
Brown, 1998; Schendan and Kutas, 2003). 
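
The comparison underlying this use of fMR-A can be written out as simple arithmetic. The condition names and response values below are hypothetical and serve only to make the contrast explicit: adaptation is measured as the reduction in response relative to pairs of faces depicting different people.

```python
from statistics import mean

# Hypothetical mean BOLD responses (percent signal change) to the second
# stimulus of a pair, one value per participant; invented for illustration.
responses = {
    "same_token": [0.40, 0.45, 0.38],
    "same_person_new_token": [0.52, 0.50, 0.55],
    "different_person": [0.70, 0.72, 0.68],
}


def adaptation(condition, baseline="different_person"):
    """Response reduction relative to the different-person pairs."""
    return mean(responses[baseline]) - mean(responses[condition])


# Adaptation when the identical image is repeated.
token_adaptation = adaptation("same_token")
# Adaptation when a new image of the same person is shown.
invariant_adaptation = adaptation("same_person_new_token")
```

A positive `invariant_adaptation` is the pattern that an invariant representation would predict. As argued above, it is not sufficient evidence for one, since top-down influences could produce the same reduction in a region whose representations are not invariant.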

Several studies investigated invariant face representations using
fMR-A, with mixed results: some studies found evidence for
adaptation (Vuilleumier et al., 2002) while others did not (Pourtois
et al., 2005). Ewbank and Andrews (2008) found fMR-A for
repetition of face identity across different viewpoints in FFA when
presenting familiar faces, but not when presenting novel faces.
The likelihood of observing adaptation across different face tokens
depicting the same person in fMR-A studies seems to be a function
of the duration of the lag between two stimuli, with longer lags
leading to more invariance in some studies (Andresen et al., 2009),
but the mechanisms underlying this phenomenon remain unclear.
A recent study (Mur et al., 2010) found fMR-A for
the repetition of face identity across different viewpoints in several
regions, including early visual cortex. Given the current
understanding of representations in early visual cortex, it is unlikely that
this region carries invariant face representations. Findings such as
this suggest that fMR-A can occur due to top-down influences.

To overcome the interpretative challenges that arise in fMR- 
A studies, invariant face representations have been investigated 
with MVPA. Experiments designed to investigate invariance with 
MVPA typically involve the presentation of multiple different 
tokens (e.g., different facial expressions, different viewpoints) of 
each face identity. The BOLD responses to those face tokens are 
then split into a subset used for the training of a classifier (for 
instance a support vector machine), and a subset used for the 
testing of the performance of the trained classifier. A possible
approach is to split the data into subsets so that each part contains
responses to all stimuli shown. In this case, the training and testing 
subsets contain the BOLD signal in response to different presen- 
tations of the same identical images. This analysis approach is not 
circular (data from different runs are used for the training and test- 
ing of classifiers), but since responses to the same images are used 
for training and testing, the classifier could potentially achieve sig- 
nificant classification accuracy relying on representations that are 
not invariant. 

Despite these concerns, a recent study (Nestor et al., 2011) used
this approach and found accuracies significantly above chance in
FFA but at chance in early visual cortex for the classification of
face identity in the presence of different facial expressions. The
robust classification accuracies obtained in this
study are probably due to the contribution of
invariant representations. However, other studies reported significant
classification accuracy for faces seen from different viewpoints
even in early visual cortex when using this method (Anzellotti et al.,
2013). This is in contrast with the current understanding of
representations in early visual cortex, and suggests that the conclusions
obtained with this method should be interpreted with caution.

A more stringent method that overcomes the concerns discussed
above consists in splitting the data so that the training set and the
testing set contain responses to different viewing conditions (for
example, training on one viewpoint and testing on another). In this
case, the training and testing subsets
contain the BOLD signal in response to different images. Using this
method, classification across different viewpoints was at chance in 
early visual cortex, but was significant in other ventral stream 
regions (Anzellotti et al., 2013). In particular, even when using the 
responses to different stimuli for training and testing, and con- 
trolling carefully the "low-level" properties of images, significant 
classification generalizing across viewpoints was observed in both 
occipitotemporal and anterior temporal regions (Anzellotti et al., 
2013). However, significant classification does not directly imply 
that a region carries representations of identity. The extent to 
which representations are invariant to transformations may vary, 
and a brain region could show invariance for some image trans- 
formations that humans can generalize across, but not for others. 
According to our definitions, such a representation would count 
as an invariant representation, but not as a representation of face 
identity. 
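
The difference between the two ways of splitting the data can be made concrete with a small sketch. Everything in it is an assumption made for illustration: the two-voxel patterns are invented to mimic a region whose identity code depends on viewpoint, the identities are hypothetical, and a nearest-centroid classifier stands in for the support vector machine mentioned above.

```python
from statistics import mean

# Hypothetical trials from a region with a viewpoint-specific identity code.
# Each trial: (identity, viewpoint, run, pattern over two voxels).
trials = [
    ("anna", "front", 1, [1.0, 0.0]), ("anna", "front", 2, [0.9, 0.1]),
    ("ben", "front", 1, [-1.0, 0.0]), ("ben", "front", 2, [-0.9, 0.1]),
    ("anna", "profile", 1, [0.1, -1.0]), ("anna", "profile", 2, [0.1, -0.9]),
    ("ben", "profile", 1, [0.1, 1.0]), ("ben", "profile", 2, [0.1, 0.9]),
]


def split_by_run(trials, test_run):
    """Train and test on the same images, taken from different runs."""
    train = [t for t in trials if t[2] != test_run]
    test = [t for t in trials if t[2] == test_run]
    return train, test


def split_by_viewpoint(trials, test_view):
    """Train on some viewpoints and test on a viewpoint never seen in training."""
    train = [t for t in trials if t[1] != test_view]
    test = [t for t in trials if t[1] == test_view]
    return train, test


def accuracy(train, test):
    """Nearest-centroid classification of identity."""
    identities = sorted({t[0] for t in train})
    centroids = {
        who: [mean(v) for v in zip(*(t[3] for t in train if t[0] == who))]
        for who in identities
    }
    hits = 0
    for who, _, _, pattern in test:
        guess = min(identities, key=lambda c: sum(
            (a - b) ** 2 for a, b in zip(pattern, centroids[c])))
        hits += guess == who
    return hits / len(test)


same_image_accuracy = accuracy(*split_by_run(trials, test_run=2))
cross_view_accuracy = accuracy(*split_by_viewpoint(trials, test_view="profile"))
```

For this non-invariant toy region, the split by run yields perfect accuracy, while the split by viewpoint falls to chance. Only the second split therefore tests whether the information about identity generalizes across images.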

Finding significant classification accuracy across some
specific transformations in multiple brain regions does not imply
that the regions encode the same kind of representations. Therefore,
occipitotemporal regions and anterior temporal regions
might still encode different representations. To test this, a recent 
experiment investigated whether representations in different brain 
regions encoded information about face identity generalizing 
across different face halves (Anzellotti and Caramazza, 2014). For 
this manipulation, invariance was only found in the right anterior 
temporal lobe, and not in occipitotemporal cortex. 

In the process of generating increasingly invariant representations,
some information about identity-irrelevant differences
between face tokens might be discarded or represented implicitly
(DiCarlo and Cox, 2007). For this reason, the study of how and
where identity-irrelevant information (e.g., information about
viewpoint, illumination, and so on) is encoded can be seen as
a complementary investigation to the study of invariance. Several
studies provide evidence that identity-irrelevant information
declines moving from posterior to anterior regions in the ventral
stream (Kietzmann et al., 2012; Anzellotti and Caramazza, 2014;
see Freiwald and Tsao, 2010 for similar evidence in monkeys, and
Yovel and Freiwald, 2013 for a discussion of issues of homology).
However, some identity-irrelevant information might still
be present in more anterior regions (DiCarlo and Maunsell, 2003;
Kravitz et al., 2008).

CONCLUSION 

Investigating the neural mechanisms underlying the recognition
of face identity in humans is challenging, but the continuous
development and improvement of design and analysis techniques
has allowed the localization of representations that distinguish
between face tokens depicting different people, and has made it
possible to begin investigating their invariance. Given the current
state of neuroimaging evidence, one region seems to encode face
representations showing the greatest invariance: the right anterior temporal lobe
(Anzellotti et al., 2013; Anzellotti and Caramazza, 2014). This
conclusion is consistent with neuropsychological evidence of deficits
for face recognition after damage to the right anterior temporal
lobe (Tranel et al., 1997), and with electrophysiology studies in
monkeys (Freiwald and Tsao, 2010). However, it is important to
note that current evidence does not establish that the right anterior
temporal lobe is the only locus of face identity recognition:
bilateral deficits are frequent in the anterior temporal lobes, and
thus it remains possible that the left anterior temporal lobe also
contributes, although to a lesser extent, to the recognition of face
identity. In previous studies, the anterior temporal lobes have been
implicated in semantic knowledge (Hodges et al., 1992; Tsukiura
et al., 2002; Patterson et al., 2007). Invariant face representations
could play an important role in linking perceptual inputs to semantic
knowledge about people.

Invariance does not appear only in the anterior temporal lobe,
but builds up gradually, being present already to some extent in
occipitotemporal regions (Kietzmann et al., 2012; Anzellotti et al.,
2013; see Freiwald and Tsao, 2010 for consistent electrophysiology
findings in monkeys), suggesting different roles for
occipitotemporal and anterior temporal cortex in the recognition of face
identity.